An empirical analysis of the probabilistic K-nearest neighbour classifier

Authors

  • S. Manocha
  • Mark A. Girolami
Abstract

The probabilistic nearest neighbour (PNN) method for pattern recognition was introduced to overcome a number of perceived shortcomings of nearest neighbour (NN) classifiers, namely the lack of any probabilistic semantics when making predictions of class membership. In addition, the NN method possesses no inherent principled framework for inferring the number of neighbours, K, or the parameters associated with the chosen metric. Whilst the Bayesian inferential methodology underlying the PNN classifier undoubtedly overcomes these shortcomings, there has to date been no extensive systematic study of the performance of the PNN method, nor any comparison with the standard non-probabilistic approach. We address this issue by undertaking an extensive empirical study which highlights the essential characteristics of PNN when compared to a cross-validated K-NN.
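
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a cross-validated K-NN: K is chosen as the candidate value with the lowest held-out error. It is not the authors' experimental code. The function names (`knn_predict`, `cv_error`, `select_k`), the Euclidean metric, the 5-fold split, the tie-breaking rules and the toy data are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to `query`.

    Assumes Euclidean distance; the paper's metric may differ.
    """
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    top = max(votes.values())
    # Vote ties go to the smallest label so that results are deterministic.
    return min(label for label, count in votes.items() if count == top)


def cv_error(xs, ys, k, n_folds=5, seed=0):
    """Fraction of points misclassified when each fold is held out in turn."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    errors = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        errors += sum(knn_predict(train_x, train_y, xs[i], k) != ys[i] for i in fold)
    return errors / len(xs)


def select_k(xs, ys, candidates, n_folds=5, seed=0):
    """Return the candidate K with the lowest cross-validated error.

    Equal errors are resolved in favour of the smaller K.
    """
    scores = {k: cv_error(xs, ys, k, n_folds, seed) for k in candidates}
    best_k = min(candidates, key=lambda k: (scores[k], k))
    return best_k, scores


# Hypothetical toy data: two tight, well-separated clusters in the plane.
rng = random.Random(1)
toy_x = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for _ in range(20)]
toy_x += [(rng.gauss(10, 0.1), rng.gauss(10, 0.1)) for _ in range(20)]
toy_y = [0] * 20 + [1] * 20

best_k, scores = select_k(toy_x, toy_y, [1, 3, 5, 7])
print(best_k, scores)
```

Note that this procedure returns a single K and hard class labels only. That is the gap the abstract describes: the PNN approach instead treats K as a quantity to be inferred and yields class probabilities.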


Similar resources

Optimal weighted nearest neighbour classifiers

We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted k-nearest neighbour classifier depends asymptotically only on the dimension d of the feature vec...


A Variable Metric Probabilistic k-Nearest-Neighbours Classifier

The k-nearest neighbour (k-nn) model is a simple, popular classifier. Probabilistic k-nn is a more powerful variant in which the model is cast in a Bayesian framework using (reversible jump) Markov chain Monte Carlo methods to average out the uncertainty over the model parameters. The k-nn classifier depends crucially on the metric used to determine distances b...


Generating Estimates of Classification Confidence for a Case-Based Spam Filter

Producing estimates of classification confidence is surprisingly difficult. One might expect that classifiers that can produce numeric classification scores (e.g. k-Nearest Neighbour or Naive Bayes) could readily produce confidence estimates based on thresholds. In fact, this proves not to be the case, probably because these are not probabilistic classifiers in the strict sense. The numeric sco...


An experimental comparison of neural and statistical non-parametric algorithms for supervised classification of remote-sensing images

An experimental analysis of the use of different neural models for the supervised classification of multisensor remote-sensing data is presented. Three types of neural classifiers are considered: the Multilayer Perceptron, a kind of Structured Neural Network, proposed by the authors, that allows the interpretation of the network operation, and a Probabilistic Neural Network. Furthermore, the k-...


Scene Classification Via pLSA

Given a set of images of scenes containing multiple object categories (e.g. grass, roads, buildings) our objective is to discover these objects in each image in an unsupervised manner, and to use this object distribution to perform scene classification. We achieve this discovery using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature, here ap...



Journal:
  • Pattern Recognition Letters

Volume 28  Issue 

Pages  -

Publication date: 2007